auxiliary distribution
- Asia > Middle East > Jordan (0.05)
- Oceania > Australia > New South Wales > Sydney (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > China > Beijing > Beijing (0.04)
- Asia > Middle East > Jordan (0.05)
- Oceania > Australia > New South Wales > Sydney (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > China > Beijing > Beijing (0.04)
Anchored Supervised Fine-Tuning
Zhu, He, Su, Junyou, Lai, Peng, Ma, Ren, Zhang, Wenjia, Yang, Linyi, Chen, Guanhua
Post-training of large language models involves a fundamental trade-off between supervised fine-tuning (SFT), which efficiently mimics demonstrations but tends to memorize, and reinforcement learning (RL), which achieves better generalization at higher computational cost. Dynamic Fine-Tuning (DFT) recently emerged as a promising middle ground, reweighting SFT objectives with token probabilities and achieving improvements in certain reasoning domains, though it exhibits instability in other tasks. We provide a analysis of DFT through the reward-weighted regression (RWR) framework, revealing that it corresponds to a specific auxiliary distribution choice that yields provably tighter RL bounds than standard SFT. However, our analysis also uncovers a critical limitation: this construction lacks distributional anchoring, leading to progressive drift that undermines training stability. To address this, we propose Anchored Supervised Fine-T uning (ASFT), which augments DFT's reweighting with lightweight KL regularization to preserve tightness while ensuring stability. Empirically, ASFT consistently outperforms both SFT and DFT across mathematical reasoning, medical knowledge grounding, and code generation, achieving substantial improvements with minimal computational overhead. Our RWR framework provides a systematic lens for understanding post-training methods and demonstrates that principled theoretical analysis leads to both stronger guarantees and practical gains. Large language models (LLMs) have become a central substrate for modern AI systems, powering instruction following, tool use, and multi-step reasoning at scale (Brown et al., 2020; Touvron et al., 2023; Achiam et al., 2023; Guo et al., 2025).
Sequential Monte Carlo for Graphical Models
We propose a new framework for how to use sequential Monte Carlo (SMC) algorithms for inference in probabilistic graphical models (PGM). Via a sequential decomposition of the PGM we find a sequence of auxiliary distributions defined on a monotonically increasing sequence of probability spaces. By targeting these auxiliary distributions using SMC we are able to approximate the full joint distribution defined by the PGM. One of the key merits of the SMC sampler is that it provides an unbiased estimate of the partition function of the model. We also show how it can be used within a particle Markov chain Monte Carlo framework in order to construct high-dimensional block-sampling algorithms for general PGMs.
Non-exchangeable Conformal Prediction with Optimal Transport: Tackling Distribution Shifts with Unlabeled Data
Correia, Alvaro H. C., Louizos, Christos
Conformal prediction is a distribution-free uncertainty quantification method that has gained popularity in the machine learning community due to its finite-sample guarantees and ease of use. Its most common variant, dubbed split conformal prediction, is also computationally efficient as it boils down to collecting statistics of the model predictions on some calibration data not yet seen by the model. Nonetheless, these guarantees only hold if the calibration and test data are exchangeable, a condition that is difficult to verify and often violated in practice due to so-called distribution shifts. The literature is rife with methods to mitigate the loss in coverage in this non-exchangeable setting, but these methods require some prior information on the type of distribution shift to be expected at test time. In this work, we study this problem via a new perspective, through the lens of optimal transport, and show that it is possible to estimate the loss in coverage and mitigate it in case of distribution shift.
- Europe > Netherlands > South Holland > Delft (0.04)
- Europe > Netherlands > North Holland > Amsterdam (0.04)
- Europe > Finland > Uusimaa > Helsinki (0.04)
Sequential Monte Carlo for Graphical Models
We propose a new framework for how to use sequential Monte Carlo (SMC) algorithms for inference in probabilistic graphical models (PGM). Via a sequential decomposition of the PGM we find a sequence of auxiliary distributions defined on a monotonically increasing sequence of probability spaces. By targeting these auxiliary distributions using SMC we are able to approximate the full joint distribution defined by the PGM. One of the key merits of the SMC sampler is that it provides an unbiased estimate of the partition function of the model. We also show how it can be used within a particle Markov chain Monte Carlo framework in order to construct high-dimensional block-sampling algorithms for general PGMs.
On Convergence of the Alternating Directions SGHMC Algorithm
Ghosh, Soumyadip, Lu, Yingdong, Nowicki, Tomasz
We study convergence rates of Hamiltonian Monte Carlo (HMC) algorithms with leapfrog integration under mild conditions on stochastic gradient oracle for the target distribution (SGHMC). Our method extends standard HMC by allowing the use of general auxiliary distributions, which is achieved by a novel procedure of Alternating Directions. The convergence analysis is based on the investigations of the Dirichlet forms associated with the underlying Markov chain driving the algorithms. For this purpose, we provide a detailed analysis on the error of the leapfrog integrator for Hamiltonian motions with both the kinetic and potential energy functions in general form. We characterize the explicit dependence of the convergence rates on key parameters such as the problem dimension, functional properties of both the target and auxiliary distributions, and the quality of the oracle.
- North America > United States > New York (0.04)
- Europe > Romania > Sud-Est Development Region > Constanța County > Constanța (0.04)
- Asia > Japan > Honshū > Tōhoku > Fukushima Prefecture > Fukushima (0.04)
Channel Estimation in RIS-Enabled mmWave Wireless Systems: A Variational Inference Approach
Fredj, Firas, Feriani, Amal, Mezghani, Amine, Hossain, Ekram
Channel estimation in reconfigurable intelligent surfaces (RIS)-aided systems is crucial for optimal configuration of the RIS and various downstream tasks such as user localization. In RIS-aided systems, channel estimation involves estimating two channels for the user-RIS (UE-RIS) and RIS-base station (RIS-BS) links. In the literature, two approaches are proposed: (i) cascaded channel estimation where the two channels are collapsed into a single one and estimated using training signals at the BS, and (ii) separate channel estimation that estimates each channel separately either in a passive or semi-passive RIS setting. In this work, we study the separate channel estimation problem in a fully passive RIS-aided millimeter-wave (mmWave) single-user single-input multiple-output (SIMO) communication system. First, we adopt a variational-inference (VI) approach to jointly estimate the UE-RIS and RIS-BS instantaneous channel state information (I-CSI). In particular, auxiliary posterior distributions of the I-CSI are learned through the maximization of the evidence lower bound. However, estimating the I-CSI for both links in every coherence block results in a high signaling overhead to control the RIS in scenarios with highly mobile users. Thus, we extend our first approach to estimate the slow-varying statistical CSI of the UE-RIS link overcoming the highly variant I-CSI. Precisely, our second method estimates the I-CSI of RIS-BS channel and the UE-RIS channel covariance matrix (CCM) directly from the uplink training signals in a fully passive RIS-aided system. The simulation results demonstrate that using maximum a posteriori channel estimation using the auxiliary posteriors can provide a capacity that approaches the capacity with perfect CSI.
- North America > United States (0.14)
- North America > Canada > Manitoba (0.04)
- Asia > China (0.04)
- Telecommunications (0.68)
- Information Technology (0.66)
- Information Technology > Communications > Networks (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Hierarchical Semi-Implicit Variational Inference with Application to Diffusion Model Acceleration
Yu, Longlin, Xie, Tianyu, Zhu, Yu, Yang, Tong, Zhang, Xiangyu, Zhang, Cheng
Semi-implicit variational inference (SIVI) has been introduced to expand the analytical variational families by defining expressive semi-implicit distributions in a hierarchical manner. However, the single-layer architecture commonly used in current SIVI methods can be insufficient when the target posterior has complicated structures. In this paper, we propose hierarchical semi-implicit variational inference, called HSIVI, which generalizes SIVI to allow more expressive multi-layer construction of semi-implicit distributions. By introducing auxiliary distributions that interpolate between a simple base distribution and the target distribution, the conditional layers can be trained by progressively matching these auxiliary distributions one layer after another. Moreover, given pre-trained score networks, HSIVI can be used to accelerate the sampling process of diffusion models with the score matching objective. We show that HSIVI significantly enhances the expressiveness of SIVI on several Bayesian inference problems with complicated target distributions. When used for diffusion model acceleration, we show that HSIVI can produce high quality samples comparable to or better than the existing fast diffusion model based samplers with a small number of function evaluations on various datasets.